EDWIN HUTCHINS


Computers are the most plastic medium ever invented for the representation and propagation of
information. In fact, they are so adaptable and can manifest such a wide range of behaviors, that little but
the hardware itself may be easily identifiable as an enduring property of the device. Computers can
mimic the behaviors of other information media and can manifest behaviors that are simply not possible
in any other medium. We might speak literally about the nature of the computer’s behavior (to the
extent we can speak literally about anything) at a very low level, describing the changes in the states of
silicon gates and so on, but even there we frequently resort to metaphors. As the levels of complexity
are layered one atop the other to produce the high-level behaviors that are the actions we recognize
while interacting with the computer, the possibility of talking or thinking literally about the computer’s
behavior vanishes. We deal with this complexity and this plasticity by speaking metaphorically about
the behavior of the computer. The metaphors we use, both intentionally and unintentionally, contribute
structure in terms of which we organize our understandings of what is going on (Lakoff & Johnson,
1980). My machine, for example, "reads, writes, copies, and edits" files, "flushes" buffers, "creates,
refreshes, kills, and buries" windows, "arrests" processes, "inspects, describes, and sends messages to"
objects, "calls and traces" functions, and a great deal more. I would have little hope of understanding
what the machine can do if I did not have a sense of what sorts of "things" exist in my machine and
what sorts of activities those things engage in. This sense is provided, in large part, by an extensive set
of metaphors.


TYPES  OF  INTERFACE  METAPHOR 


Metaphors are applied to virtually all levels of system behavior. System designers use metaphors
when thinking about their designs, and in this way, metaphors may shape the design process. The
metaphors also provide a language within the design community that designers use to communicate
their designs to each other. Some, like "reading" and "writing," are thoroughly entrenched in the culture
of computer design. Metaphors reach the user community as ways of talking about the behavior of the
system and here they provide the users with resources for thinking about what the machine is doing.
The importance of metaphors in the presentation of computer systems is revealed by the rate at which
metaphors are being registered as trademarks in the current highly competitive computer marketplace.
Of course, users do not necessarily understand a system the way it is understood by designers and
marketing analysts. Users must invent their own interpretations of the metaphors and discover the
limits of the mapping of the metaphor onto the behavior of the system. Users sometimes even invent their
own metaphors as a means of coming to terms with the behavior of a system. Metaphors are, therefore,
not fundamental properties of the system behavior per se. They are, instead, ways of understanding the
system’s behavior. However, as a convenience, I shall use the names of particular metaphors to refer to
interfaces that were designed in accordance with or are well conceived in terms of that metaphor.

There are at least three distinguishable types of metaphor describing various aspects of
human-computer interface design.

•  Activity metaphors. These refer to the user's highest level goals or to the institutional goals
that are held for the user whether the user shares them or not. Activity metaphors structure
expectations or intentions with respect to the outcome of the interaction. Is the user playing a
game? Designing an artifact? Communicating with other humans? Controlling a process?

•  Mode of interaction metaphors. The reference to "dialogue" in the title of this workshop and
many of its papers is an example of the use of a mode of interaction metaphor. These
metaphors organize understandings about the nature of the interaction with the computer. Mode of
interaction metaphors concern the relationship between the user and the computer without
regard for the particular task the user is attempting to accomplish via the computer. The
choice of metaphor at this level determines what sort of thing the user thinks the computer is.
Is it a conversational partner? An environment for action? A tool box and materials shed?

•  Task domain metaphors. Task domain metaphors provide the user with a structure for
understanding the nature of particular tasks as presented by the computer. A common metaphor for
the management of information stored in computers, for example, is the "file" system
metaphor. The user can behave as if information is stored in files that have properties something
like those of paper files stored in a file cabinet. The computer provides a set of file
manipulation operations that may have analogues in the operations one performs on paper files.
Material can be added to or deleted from the files, new files can be created, files can be
removed from the file system, and so on. Editors, mail programs, terminal emulators,
debuggers, and other application packages are built on task domain metaphors that give
coherence to the activities they support. Each defines the objects and the operations that exist in the
task domain, and each hopefully provides a structure that is easily mappable onto the behaviors
of the system.

There is some independence between these types of metaphor. The operations on files provided under
the file manipulation metaphor could be invoked under any of several mode of interaction metaphors.
The user might specify an action to be taken on a file, for example, by describing the action
conversationally, by manipulating controls that cause the action to happen, by issuing a command to execute the
action, or by performing in some other mode of interaction. There are also constraints among these
types of metaphor. Some mode of interaction metaphors, for example, can only be maintained via the
creation of appropriate domain metaphors.
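
The independence between task domain and mode of interaction can be sketched in code. What follows is only an illustration under stated assumptions, not a description of any system discussed in this paper: the names FileStore, command_mode, and manipulation_mode are hypothetical. One task domain operation (deleting a file) is reached through two different modes of interaction.

```python
class FileStore:
    """Hypothetical task domain: named files under the "file" metaphor."""

    def __init__(self, names):
        self.files = set(names)

    def delete(self, name):
        # The task domain operation itself, independent of how it is invoked.
        if name not in self.files:
            raise KeyError(name)
        self.files.remove(name)


def command_mode(store, line):
    """Command-style mode: the user types an expression such as 'delete foo'."""
    verb, name = line.split()
    if verb != "delete":
        raise ValueError("unknown command: " + verb)
    store.delete(name)


def manipulation_mode(store, selected_name, control):
    """Control-style mode: the user selects an object and operates a control."""
    if control != "trash":
        raise ValueError("unknown control: " + control)
    store.delete(selected_name)


store = FileStore(["foo", "bar"])
command_mode(store, "delete foo")         # symbolic expression
manipulation_mode(store, "bar", "trash")  # selection plus control
remaining = sorted(store.files)           # both files have been removed
```

Both front ends call the same delete operation, so the choice between them changes how the user relates to the computer without changing what can be done to the files.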

In this paper I am most concerned with metaphors for mode of interaction. Primary attention will be
focused upon these four: (a) conversation, (b) declaration, (c) model-world, and (d) collaborative
manipulation. I will show how mode of interaction metaphors are essential to the user’s interpretation of the
behavior of the interface, how interface designers, sometimes unknowingly, encourage particular
metaphorical interpretations of the interfaces they design, and how the choice of metaphor has important, but
often overlooked, consequences for both the designers and the users of interfaces.


THE  CONVERSATION  METAPHOR 

The metaphor of user and computer engaged in a conversation with each other or carrying on a
dialogue about the task at hand is the most popular of the mode of interaction metaphors for
human-computer interfaces. This metaphor seems to be based upon a structure of assumptions that goes
something like this:

1.  The problem of human-computer interfaces is a communication problem. In order to work with
each other, the user and the computer must communicate.

2.  Human-to-human communication is carried out primarily by means of conversation.

3.  Because humans already have considerable skills for interacting with each other, making a
computer interface behave like a human permits the human user to utilize already acquired skills,
and that makes the interaction easier for the user. That is, human-computer interfaces become
more usable the more they mimic human-human interactions.

4.  Therefore, human-computer interfaces should support conversation between user and computer.¹

Consider some of the properties of the conversation metaphor. The conversation metaphor inserts an
implied intermediary between the user and the world in which actions are taken (see Figure 1). In a
system built on the conversation metaphor, the interface is a language medium in which the user and
the system have a conversation about some world. The interface is an implied intermediary between
the user and the world about which things are said. In many cases, the world about which things are
said is not explicitly represented. In such a setting, the burden is on the user to maintain a model of
the state of this unrepresented world. This can be a considerable burden and can lead to many sorts of
errors, especially the attempt to carry out actions in inappropriate environments. Alternatively, it can
lead to the user making frequent requests to the intermediary to describe or report on the relevant
aspects of the task environment, e.g., requesting a listing of file names prior to describing a file system
operation. On the other hand, an interface built on the conversational metaphor can take full advantage
of the power of abstraction available in symbolic reference. The implied intermediary can be charged
with the responsibility of mapping user input expressions onto the world of interest, enabling very
economical descriptions. The popularity of the conversation metaphor may be due both to the surface
credibility of the assumptions on which it is based and to the strength of the teletype legacy. For the
first three decades of computer use, the teletype and its technological relatives have been the primary
form of interface hardware. Dealing as they do in characters and lines of text, they naturally support, if
not a conversation, then at least an exchange of character strings between user and computer. Perhaps
we can take batch processing to be the prototype of early human-computer conversation with the
participants taking very long conversational turns via card reader and line printer. The teletype permitted
shorter conversational turns, but it was still interaction based on the conversation metaphor and it is still
very low bandwidth communication. In order to get much done through low bandwidth
communication, one needs dense symbols in the interface language; symbols that stand for complicated procedures,
for example. This narrowness of bandwidth encourages even more the conception of the computer as an
agent that can interpret simple symbols that refer to complicated procedures. Furthermore, this
metaphor feeds and is fed by other related metaphors. I do not know which came first historically, the
concept of the computer as a brain, the heart of artificial intelligence, or the notion of conversing with it.
Clearly, each suggests the other, and as either gains strength so does the other.

FIGURE 1. The Conversation Interface. Here the user has a conversation with an intermediary who acts on the world of action.
The conversation consists of exchanges of symbolic descriptions between user and interface intermediary.

¹ There are good reasons to question each of these assumptions. I will present these reasons at the close of the paper.

Finally, regardless of our metaphorical preferences with regard to mode of interaction, the fact is that
every interface implements an interface language in which the user composes expressions that are
subsequently interpreted by the computer and in which the computer composes expressions that inform the
user of what has happened. That seems like the literal makings of a conversation no matter what we
may think.

All of these factors suggest a conversational conception of human-computer interaction. Yet, the
conversational metaphor does not quite fit the reality of most human-computer interactions. Typical
conversations on "conversational" interfaces are very stilted in a variety of ways discussed by other
papers in this workshop. For example, the typical human-machine conversation is conducted with a
limited partner via a low bandwidth channel using a severely constrained vocabulary and language
syntax. The conversing parties do not mutually repair each other’s production errors, and of course, the
user’s conversational turn typically consists of typing rather than speaking, while the machine’s turn
consists of displaying characters on a screen. These discrepancies between the metaphorical ideal of
human-human conversation and the reality of human-computer conversation form a sort of design
vacuum. Having decided upon the desirability of the conversational metaphor, that metaphor now pulls
interface technology toward the full realization of the metaphorical potential. If one consults the
proceedings of almost any interface design conference, one will find a host of efforts to fill this design
vacuum. If only we could use natural language and could speak our input. If only the machine could
understand what we mean and talk back to us. Then we would have a truly conversational interface.
This is a healthy role for a metaphor, but not one that is usually considered when the metaphor is
suggested.


BEYOND  CONVERSATION 

Recently, something different has been happening in interface design. With the widespread
availability of new interface hardware including high-resolution bitmapped displays, pointing devices, and faster
processors, a new class of interface has emerged. Literally hundreds of such systems are now available
and they appear to be very popular, especially with casual users. It is certainly possible to regard these
interfaces using the conversational metaphor. I take references to "visual dialogues," "gestural
dialogues," "graphical languages," etc., to be examples of the application of the conversational metaphor to
these systems. Shneiderman (1982, 1983) coined the term "direct manipulation" to refer to these
systems. The technology on which these systems are based has actually been around for more than 20
years (Sutherland, 1963), but it has only become widely available in the past few years.

The research group with which I am affiliated has been in the business of building interfaces of this
type for many years. Examples include a simulation-based steam propulsion training system, Steamer
(Hollan, Hutchins, & Weitzman, 1984), a graphics editor (Hollan, Hutchins, McCandless, Rosenstein, &
Weitzman, in press), a radar navigation training system, and a "direct manipulation" statistical analysis
facility (Owen, 1986). Until recently, however, we have not thought very seriously about why these
interfaces work the way they do. We believe that an understanding of the cognitive principles that
underlie their apparent usability will enable us to build even better interfaces.

Some researchers have tried to identify "direct manipulation" with a particular set of interface
behaviors. Shneiderman, for example, uses direct manipulation to refer to systems having the
following characteristics:

1.  Continuous representation of the objects of interest.

2.  Physical actions or labeled button presses instead of complex syntax.

3.  Rapid incremental reversible operations whose impact on the object of interest is immediately
visible. (1982, p. 251)

We believe that a checklist is a weak approach to understanding these interfaces. Even if these are the
right characteristics, we would like to know why they are good.

In an earlier paper (Hutchins, Hollan, & Norman, 1985), we described two aspects of the interface
that seemed to produce the sensation of directness of action: distance and engagement.

[Distance] involves a relationship between the task the user has in mind and the way that task
can be accomplished via the interface. Here the critical issues involve minimizing the effort
required to bridge the gulf between the user's goals and the way they must be specified to the
system. (Hutchins, Hollan, & Norman, 1985, p. 318)

We identified two components of distance in this gulf, semantic distance and what I will call here
referential distance.² Figure 2 shows the gulf. Figure 3 shows the relationship between these types of
distance.

Semantic distance concerns the relationship between the user's intentions and the meanings of the
expressions that are possible in the interface language. It refers to the extent to which the interface
language provides means of expressing the user's intentions. Is there a simple expression for what one
intends, or is one obliged to construct a lengthy circumlocution? High-level programming languages
can be seen as attempts to reduce semantic distance by providing the user with simple expressions (e.g.,
function names) that refer to frequently encountered problem decompositions.

Referential distance refers to the extent that the user’s understanding of the meaning of the
expression is similar to the user’s understanding of the form of the expression. Symbolic interfaces, for
example, are typically high in referential distance because the relationships between the forms of the
expressions and their meanings are arbitrary. We proposed a cognitive basis for this sensation, arguing
that the better the interface to a system helps bridge the gulf between user intention and action, the less
cognitive effort needed and the more direct the resulting feeling of interaction.

FIGURE 2. The Gulfs of Execution and Evaluation. Each Gulf is unidirectional: The Gulf of Execution extends from user goals
to system state; the Gulf of Evaluation extends from system state to user goals.

FIGURE 3. Every expression in the interface language has a meaning and a form. Semantic distance reflects the relationship
between the user's intentions and the meanings of expressions in the interface language for both input and output. Referential
distance reflects the relationship between the physical form of the expression and its meaning. The easier it is to get from the form of
the expression to meaning, the smaller the referential distance.

² In the earlier work of Hutchins, Hollan, and Norman, this concept was called articulatory distance. This name is unfortunate for
reasons that should become clear as I explicate the meaning of referential distance.

Engagement proved more difficult to deal with. We felt that

The systems that best exemplify direct manipulation all give the qualitative feeling that one is
directly engaged with the control of objects -- not with the programs, not with the computer,
but with the semantic objects of our goals and intentions. (Hutchins, Hollan, & Norman, 1985,
p. 318)

When it came to specifying how this sensation was to be produced, however, we also resorted to a
checklist, not unlike the one proposed by Shneiderman. We did add the condition that the interface
language should present to the user a model world such that the objects of that world appear and
behave as though they are the objects of interest. We knew that the model world was important, but we
were stuck thinking about the properties of the interface language. In particular we were implicitly
committed to the idea that expressions in the interface had "meanings" that were to be interpreted by
the machine, in the case of user input expressions, or that were in some sense intended by the machine,
in the case of machine output. As a consequence, our discussion at that time focused on techniques for
reducing referential distance by using expressions that have nonarbitrary relations to their referents. We
considered onomatopoeia and iconic representation, and located the power of pointing devices in the fact
that they are "spatio-mimetic." With the exception of the "spatio-mimetic" nature of pointing devices,
these ideas are grounded in the conversational metaphor, and it is not possible to understand the power
of the model-world metaphor without shaking them off. At that time we failed to see that while as
observers and actors we may certainly intend what we do in the world and interpret the consequences,
the world itself neither interprets our actions nor intends the consequences. Our actions happen in the
world, but they do not have "meanings" that are interpreted by the world in order to determine how the
world is affected.³ The notion of "the meaning of an expression" implies a reference gap; a relationship
between one thing and some other thing that it "represents" or "stands for." The reference gap in turn
implies an interpreter, an agent that can bridge the gap and make a mapping from symbolic expression
to referent. This reference gap does not exist for actions in the world, yet it is a fundamental property
of symbolic relations, and the power of computers can be traced to their abilities as symbol systems.

I want to argue here that the reasons for the apparent usability of this new class of interfaces lie in
the nature of the relationships between expressions in the interface language and the things to which the
expressions refer. The key to the sensation of directness in these new interfaces is that these new
interface technologies permit the design of an interface under the model-world metaphor. By simulating a
world of action, this metaphor collapses the symbolic reference gap. This metaphor does not simply
reduce referential distance, it eliminates it. Before we can see how the model-world metaphor does
what it does, however, we need to consider the nature and implications of reference relations more
broadly.


THE  DECLARATION  METAPHOR 

In the opening essay of Expression and Meaning, Searle argues that

[T]here are a rather limited number of basic things we do with language: we tell people how
things are, we try to get them to do things, we commit ourselves to doing things, we express
our feelings and attitudes, and we bring about changes through our utterances. (1979, p. 29)

The first four kinds of things we do with language, assertives, directives, commissives, and expressives,
respectively, are done with descriptions of the world, but the last thing on Searle’s list, bringing about
changes through utterances, is different. Searle has termed utterances that do this declarations. These
are "cases where one brings a state of affairs into existence by declaring it to exist, cases where, so to
speak, 'saying makes it so'" (1979, p. 16). Searle gives as examples "I resign," "You're fired," "I
excommunicate you," "I appoint you chairman," and others. Successful performance of a declaration
guarantees that the propositional content of the utterance corresponds to the world. Searle says,
"Declarations bring about some alteration in the status or condition of the referred to object or objects
solely in virtue of the fact that the declaration has been successfully performed" (1979, p. 17). What
makes these utterances special is their relation to the world to which they refer. Notice that all the
objects referred to in the declarations are culturally constructed objects (D'Andrade, 1981).
Employment, membership in a church, and the chair of a meeting are all social entities. Each is embedded in a
social arrangement in which it is people’s agreement that it is so that makes it so. They refer to aspects
of the social world that exist only by virtue of the participants agreeing that they exist. The agreements
are made and unmade by language acts. These declarations change the world they refer to by changing
the agreement under which something does or does not exist. The relation between the expression and
the thing to which it refers can therefore be causal rather than simply descriptive. It is the properties of
that world that make that causality possible. Declarations are not always successfully performed, but
when they are, they have their effects because they refer to a world that can be constructed and
modified by the performance of expressions in the language.

The existence in natural language of declarations as a class of speech acts with this special reference
relation suggests that the same reference relation could also be supported by computer interfaces that
appear to be based on the conversational metaphor. And in fact, some experienced users of such
interfaces appear to discover this fact on their own. Consider what it would take to turn a "command
language" interface into a "declaration language" interface. The difference between a "command
language" and a "declaration language" interface is largely in the mind of the user. If the user parses
"delete foo" onto the deep structure corresponding to the imperative form "(you) delete foo (from the
system)" or "(I command you to) delete foo," then it is a command language interface with the implicit
imperative "you" as the implied intermediary. If the user parses "delete foo" as the declaration "(I
hereby declare) foo deleted," or even "(I hereby declare) foo deleted (from you, the system)," then it is
a declaration interface with no implied intermediary. Of course, the user might still have to keep track
of the state of the world acted upon, since it might not be explicitly represented, but this would
nevertheless be a declarative interface. Figure 4 shows the relation of user to world of action under the
declaration metaphor.

³ Of course, actions may have symbolic meanings, but these are meanings that are interpreted by other symbol processing devices,
i.e., people, not by the physical world in which they are enacted and in which they may have physical consequences.

Users could make this shift on their own. Some occasionally seem to do so. Consider a case
involving the use of the screen editor in the UNIX operating environment, vi. The command dw, shorthand
for "delete word," is a frequently invoked command in vi. Experienced users who have overlearned the
command cease regarding it as an instruction to an agent to carry out on the text file and instead regard
it as a symbolic incantation that causes the word to the right of the cursor to disappear. In shifting to
the declaration metaphor, these users have eliminated the intermediary between themselves and the
world of interest.

If declaration became the dominant metaphor for an interface, the user could become a magician for
whom every expression in the input language would be an incantation having the power of a declaration.
The magician would make the world as it is by declaring it to be so. Of course, the power of such a
magician would not lie entirely in either the magician or the language. The power of declarations lies
as much in the nature of the world that is referred to as in the utterances that do the referring. Just as
declarations in natural language depend upon the culturally constructed nature of the world to which
they refer, the declarations in a computer interface language depend upon the special nature of the
world of the computer system. And, of course, one of the great virtues of the plasticity of the computer
as a medium is that it happens to be a world in which saying something can make it so. This is an
important, but often overlooked, difference between most uses of natural language and computer
interface languages. It is a difference that can be exploited in the design of computer interfaces by
establishing reference relations between the interface language and the world to which it refers that permit the
user to think of the interface as a magical world.

FIGURE 4. The Declaration Interface. Here the user performs declarations, descriptions with causal force, directly in the world of
action. What the user observes is not clear. If things go well, state changes are observed, but if the declaration cannot be satisfied
by the world, an error message may result. Such an error message destroys the declaration metaphor.

Both the conversation and the declaration metaphors are implicit options for the user with respect to
most so-called conversational interfaces, but few of those who are known for their computational
wizardry see themselves as magicians of the declarative sort working directly on the world rather than via an
intermediary. I believe there are two major reasons for this. First, there is a strong historical and
cultural bias in favor of the conversation metaphor and against the declaration parsing. Considering the
uaer  input  tide  of  the  interaction,  the  conversation  pertint  it  implicitly  suggested  by  the  name  "com- 
maud  Unseat#,"  whan  command!  m  lamed  to  items,  and  by  the  popularity  of  the  "converaational" 
mtrynor  ro*  msfftct  otityi  in  |Mra«  moci  nporaHt  prapit  tnwfncti  inti  mm  om  ooi|m 
under  the  conventional  metaphor  frequently  bahavn  ta  ways  that  are  dURcrit  to  accomodate  fat  the 
declaration  mataphor.  For  example,  many  Inneftcee  take  advantage  of  the  implied  intermediary  at  e 
place  to  locate  error  massages.  Contider  an  arror  manege  stating  "delete-  Command  not  found"  or  "la: 
/hmchlmlfttot  Fsnrisskm  denied."  Thera  meeiagea  are  eeay  to  hnaaprat  aa  advice  from  an  intermediary 
who  hat  attempted  to  cany  out  the  oommand  but  hae  been  unable  to  do  so,  but  they  are  difficult  to 
inunpret  in  the  declarative  model  If  the  earn  fa  a  maglriaa  naming  incar  tatlont  with  cearai  force  to 
the  world,  who  la  raying  theae  things?  Who  couldn't  find  my  oommand?  Who  denial  permission? 
Than  am  aspects  of  the  system’ 4  behavior  that  an  not  captured  by  any  aspect  of  the  declaration  meta¬ 
phor,  and  fat  general,  the  declaration  metaphor  is  dtfflcuk  to  mehtteii  with  respect  to  any  interface  that 
produces  error  messages. 

If we were to design an interface with complete fidelity to the declaration metaphor, then when a
user generated an infelicitous, semantically anomalous, or ungrammatical expression, nothing would
happen. After all, nothing happens when a magician utters a meaningless or ineffective spell. So the
world should not change and there should be no notification of a problem. This reading of the
declaration metaphor would surely lead to the design of very frustrating interfaces. A better solution
would be to prevent the magician/user from ever uttering such a spell. But how can that be done without
invoking an intermediary to monitor and filter the user's utterances?

As it stands, declarations have the power to directly change the world, but nothing rules out
impossible declarations. If saying is to be doing, then there must be some way of ensuring that nothing
can be said that cannot be done. Otherwise, some intermediary will have to intervene, and that destroys
the declaration metaphor in which the magician does by saying. The declaration metaphor, in which
"saying is doing," can only be supported if everything that can be said can be done. Giving the
declarations direct causal force in the world is half the solution to the problem of supporting a metaphor
for more direct action. Constraining the production of declarations is the other half. The trouble with
declarations, however, is that they are linguistic entities. They are inherently symbolic, and they exist
independently of the things they describe. It is difficult to imagine a natural way to constrain the
production of declarations such that only those things that are possible in the world of action can be
described. The constraints would surely appear arbitrary because they belong to the domain of action,
not to the world of description building. Still, arbitrary or not, these constraints are sometimes
embodied in the interface, as, for example, in the use of dynamic menus that only present options that
are meaningful in the current task environment.
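
The dynamic menu mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration (the Document class, the option names, and the predicates); it is not drawn from any particular system. Each option carries a test of whether it can be carried out, and the menu offers only the options that pass.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical task environment: a text buffer, a selection, a clipboard."""
    text: str = ""
    selection: str = ""
    clipboard: str = ""


# Each option is paired with a predicate that says when it can be carried out.
OPTIONS = {
    "Cut": lambda d: bool(d.selection),
    "Copy": lambda d: bool(d.selection),
    "Paste": lambda d: bool(d.clipboard),
    "Select All": lambda d: bool(d.text),
}


def dynamic_menu(doc):
    """Return only the options that are meaningful in the current state."""
    return [name for name, possible in OPTIONS.items() if possible(doc)]
```

With an empty document the menu is empty, so there is no option the user could pick that the world could not satisfy. The constraint still lives in a table of predicates that is separate from the world it describes, which is one way to see why it can appear arbitrary.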

The declaration metaphor is a metaphor that half works. It is not quite viable, because it inevitably
presents the user either with opportunities to enter situations that destroy the metaphor itself or with
what seem to be arbitrary constraints on the generation of declarations to be enacted upon the world.


THE  MODEL-WORLD  METAPHOR 

The model-world metaphor can become supportable at virtually all levels of interaction in interfaces
utilizing currently available I/O technologies. The two requirements for the maintenance of a
model-world metaphor are that expressions in the interface language appear as actions with causal force
in the world of interest and that the generation of expressions is constrained such that it is not possible
to compose an expression that cannot be realized in the world of interest. Figure 5 shows the relation of
user to world of action under the model-world metaphor.


FIGURE 5. The Model-World Interface. Here the user takes action directly in the world of action, which is itself the medium for
the interface language. The user directly observes state changes in the world.


Expressions With Causal Force

In a system built on the model-world metaphor, the interface language itself can be seen as a world
where the user can act, a world that changes state in response to user actions. The world of interest is
explicitly represented and there is no intermediary between user and world. The world of interest is
constructed and manipulated by expressions in the interface language where those expressions have the
character of actions taken in the world of interest. This collapse of description to action closes the
reference gap between the expression and what it represents. The expression becomes what it
represents. Giving expressions causal force in the world of interest is the first half of the solution.
This is the basis of the magic in the declaration metaphor.

Note that giving the world of interest explicit representation is not by itself sufficient to create a
model-world. SHRDLU (Winograd, 1972) and "Put That There" (Bolt, 1980) are two very impressive
systems that have continuous representation of the world of interest. Yet neither is a model-world, since
both are explicitly conversational in nature. The expressions generated by the user are descriptions to
be interpreted by an intermediary. In fact, both systems were designed as attempts to fill the
conversational metaphor's design vacuum. These systems represent an advance over earlier
conversational interfaces because they permit a different sort of reference than is possible in
conversational settings where the world described by the expressions in the conversation is not present.
"Put That There" is especially interesting because it demonstrates the integration of gesture into
conversation. Still, the gestures are not actions in the world of interest, but are instead descriptors to be
interpreted by the intermediary agent.


Constraining  the  Generation  of  Expressions 

Although we mostly seem to overlook it, the physical world has a wonderful property. In the
physical world, one cannot do that which cannot be done. When we consider declarations in a computer
interface language as analogous to actions in the physical world, the beauty of this property becomes
apparent. The constraints of the world are manifest in our interaction with the world. This is just the
property we need to prevent the bumbling user/magician from composing an impossible expression.
Thus, one solution to the problem of the generation of inappropriate expressions is to build the
constraints of the world referred to into the tools the user has for constructing expressions about that
world. I have in mind a special sense of building the constraints of the domain into the interface
language. I do not mean to make the constraints of the domain syntactic constraints in the language.
Many programming languages attempt to do this by building the logic of the programming world into
the syntax of the language. For example, strict typing and type checking in some programming
languages make it a syntax error to do a floating point division on an integer. All this has done is to
make it a syntax violation to describe in the interface language that which is not possible in the domain
of action. That is just what we don't want. What we do want is to make it impossible to even generate a
description of that which is not possible in the domain of action. If we do that, we can collapse the
reference relation between description and action into one of identity. Generating the description is
doing the action.

Consider a simple truly constraining situation. If one has a keyboard with a certain set of characters,
then one is constrained to type only those characters that are available. This is the only built-in
constraint that exists on most "conversational" interfaces. Additional constraints might be built in by
following the model of operational interlocks on certain devices. Microwave ovens, for example, are
designed with interlocks that prevent starting the oven with the door open. In a similar way, one might
imagine building a constraint for English text entry that only permitted the letter u to be typed after q.
Once having typed q, the only key that would generate a character would be u. All others would signal
an error. (As silly as it seems, this is not far removed from the nature of many interfaces.) A better
way to enforce this constraint might be to only provide the qu combination as a pair on a single key.
This is the sense in which I intend the "building-in of constraint." It does not mean that the user is
enjoined from taking the action, or that an error will be detected and signaled if the user takes that
action. It means instead that it is simply not possible, using the tools that the interface language
provides, to generate an expression that cannot be realized in the world of action to which the
expressions refer.
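
The difference between the two ways of enforcing the q/u constraint can be sketched in a few lines. The function and variable names below are hypothetical and chosen only for illustration. The interlock design keeps a bare q key and signals an error afterwards; the built-in design offers no bare q key at all, so an ill-formed string cannot be composed from the keys that exist.

```python
import string
from itertools import product


def well_formed(text):
    """True if every 'q' in text is immediately followed by 'u'."""
    return all(text[i + 1:i + 2] == "u" for i, ch in enumerate(text) if ch == "q")


def interlock_type(text_so_far, key):
    """Interlock design: a bare 'q' key exists; a wrong key after it is an error."""
    if text_so_far.endswith("q") and key != "u":
        raise ValueError("only 'u' may follow 'q'")
    return text_so_far + key


# Built-in design: the keyboard has no bare 'q' key, only a combined 'qu' key.
BUILT_IN_KEYS = [c for c in string.ascii_lowercase if c != "q"] + ["qu"]


def type_keys(keys):
    """Text produced by pressing a sequence of keys drawn from BUILT_IN_KEYS."""
    return "".join(keys)


# Every two-key sequence on the built-in keyboard is well formed by construction.
assert all(well_formed(type_keys(pair)) for pair in product(BUILT_IN_KEYS, repeat=2))
```

In the interlock version the user can still attempt the impossible and must be told about it. In the built-in version there is nothing to report, because the offending expression has no way of being produced.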

Such constraints must be embodied in a great deal of structure, and making that structure
interpretable requires a good domain metaphor. At present the most obvious way to accomplish this is to
build the interface language as a model of a physical world. Perhaps there is some small set of
fundamental constraints that must be met in order to support the model-world conception. Something
like the existence of objects, that objects do not change unless they are acted upon, that actions may be
applied to objects that exist but cannot be applied to objects that do not exist, that objects that exist may
be seen, that objects that do not exist cannot be seen, and so on.4 These constraints on the generation of
input expressions are the basis for the claims by proponents of "direct manipulation" that error
messages are not required in these systems. The key here again is in the reference relations between the
language and the things referred to. The constraints are built into the model world, which serves a dual
function as the world of interest and as the medium for the language of interaction. This is the other
half of the solution, constraining the magician's language so that only meaningful spells can be uttered.
This is what keeps the magic from breaking, what prevents the model-world metaphor from falling
apart.

The structure that is present in the interface must be recognizable by the user. There must be a
coherent scheme for the operation of the model-world, one that makes sense so that the limitations on
the formation of expressions are unnoticed. This is the role of the domain metaphors. Choosing an
appropriate domain metaphor that will support the importation of useful structure to the task at hand is
critical to the ease of use of such systems. Different domain metaphors have different structures that
have different computational properties. Each way of conceiving of a problem may make some things
easy to see and other things difficult to see. While the model-world metaphor eliminates referential
distance, semantic distance remains an issue. The design of a task domain metaphor that efficiently
captures users' intentions is an important component of a usable model-world interface.

Of course, it is always possible to view an interface language that supports the model-world
metaphor as a medium for the communication between a user and an intermediary. While both
interpretations are available, the choice between them makes a difference. In particular, there is a
different sort of relationship between expressions in the input language and the things they refer to in
the two cases. Under the conversational metaphor, the reference relation is as it is in natural language.
An expression in the interface language is a symbolic description that refers to actions and objects. Input
expressions are interpreted by the intermediary and the actions are carried out upon the objects. Output
expressions are interpreted by the user as descriptions of the system state. In the model-world metaphor,
both input and output expressions appear to be what they refer to. Expressing the action and doing the
action are experienced as the same thing.


4 Of course, model worlds need not simulate the properties of the physical world. One of the virtues of the plasticity of the
computer medium is that worlds can exist there that could not have a physical reality.

For example, consider moving an icon in a graphical editor. The movements of the mouse and the
clicks required to pick up the icon and put it down somewhere else constitute a complex expression in
the interface language. The system designer can see this as an expression in the interface language.
But for the user, the editor presents a graphical world in which those actions that comprise the
expression in the interface language are the actions to be taken on the icon object. The graphical
representation of the icon is an expression in the interface output language, and it is also the object
being manipulated.

Given this analysis of the components of the model-world metaphor, let us return to Shneiderman's
criteria for "direct manipulation" systems. These can now be seen as descriptions of features that help
support the model-world metaphor. His requirements of continuous representation of the objects of
interest and immediate response are elements that support the creation of the world itself.
Shneiderman's notion that one should interact with the system via "physical actions or labeled button
presses instead of complex syntax" seems a bit confused, but is clearly on the right track. The heart of
the matter is that the expressions in the interface language (however they may be manifested) must be
actions in the world of interest itself. Shneiderman's call for the reversibility of actions is not an
inherent property of model worlds in general. Whether it should or should not be a property of the
domain metaphor for the model world depends upon the task. In order to support the model-world
metaphor, the world must be continuously represented and the consequences of the actions must be as
nearly immediate as is possible. But it is not just these features, it is the reference relations that are
critical. What is "direct" about direct manipulation is the collapse of description into action, the
elimination of the reference gap between the expressions in the interface language and their referents.
When we make the interface language the world of interest, we do two things. First, we make
expressions into actions. This collapses the reference gap and banishes the implied intermediary.
Second, we make the constraints of the world of interest into the constraints on the production of
expressions. This provides a natural way to prevent the user from composing an expression that cannot
be realized.


Problems  in  a  Model  World 

Interfaces  built  on  the  model-world  metaphor  suffer  from  a  number  of  problems.  They  have  recently 
become  quite  popular  in  the  commercial  marketplace,  but  they  may  not  yet  have  come  up  against  their 
inherent  limitations. 

As I have tried to demonstrate, the model world collapses symbolic reference and banishes the
intermediary who interprets the expressions in the interface language. Surely, one is giving up
something when one walks away from several millennia of progress grounded in symbolic reference.
Direct manipulation schemes have always been vulnerable to criticisms that they become cumbersome
when applied to tasks that can take advantage of the power of abstract reference. Suppose I want to
perform some action on every word in this paper that begins with the letter s. If I had an agent that
understood symbolic descriptions, I could ask it to find all such instances and perform the desired action
without knowing in advance how many or where they were. If I were dealing with a model world, what
could I do? Would I have to find every instance and act upon it in person, as it were? One way around
this problem is to acknowledge that the description specification task and the task that operates on
instances are at different levels of user intention. One could imagine then a model world that contains
as its objects elements of descriptions and operations. The user could then operate directly in that world
to compose the desired abstract action specification to be mapped across the instances in the world
where action is ultimately desired (the text file, for example). This is a solution that preserves the
model-world metaphor at a superficial level, inasmuch as the user directly constructs the abstract
description, but what shall we say of the subsequent application of that description? Is that not the
action of a new intermediary that the user has brought into existence via action in the model world?
This is a difficult question, and I have no easy answers.

The facts that the conversation and model-world metaphors seem to be capable of fading into each
other in spite of their fundamental differences and that they can be combined as in collaborative
manipulation interfaces raise the question of the importance of maintaining a consistent metaphor
throughout an interaction. I take this to be essentially an empirical question, for which there is as yet no
answer that I know of. However, it seems quite reasonable to assume that these metaphors are
something like points of view on the interface, and there is empirical evidence, in the realm of text
comprehension at least, that changes in point of view can interfere with the comprehension of text
(Abelson, 1975; Black, Turner, & Bower, 1979).

Finally, the dictum that model worlds shall provide continuous representation of the objects of
interest is very difficult to satisfy in worlds of even moderate complexity. Screen real estate is quickly
exhausted. And if everything of interest cannot be legibly presented at one time, then measures will
have to be taken to provide for display control.


THE  COLLABORATIVE  MANIPULATION  METAPHOR 

All of these metaphors are inspired by ideas about the nature of human action and interaction in the
absence of computers. The conversation metaphor is based on the assumption that the computer should
be an actor in the setting in which it works, and that in order to make it easy for humans to deal with
it, it should behave as a human does in human-human interaction. A conversation or dialogue is taken
to be the prototypic human-human interaction mode, so the computer is designed to support a
conversational interaction. The model-world metaphor rests on the assumption that one of the things
that people are really good at is manipulating objects in their environment. The activities of a craftsman
may be taken as the prototype for the development of such interfaces. The fact that there are settings in
which conversation coexists with the manipulation of objects in the world suggests that these two
metaphors might be productively combined in the design of computer interfaces.

For the past several years I have been studying navigation on large ships. In particular I have been
looking at the activities of a team of from four to six people who keep track of a ship's position while
it is entering or leaving a narrow and congested harbor (San Diego). In this world of navigation, there
are many structured representational media that are manipulated by the people in the course of doing
the task. These include the navigation chart, plotting tools, measurement tools, written records,
reference tables, etc. This is a highly evolved (in the cultural sense) activity and some of the
representational media have beautiful computational properties. For example, in plotting a position, a
representational state is imposed on a plotting device, and that device is then brought into coordination
with the structure of the nautical chart by superimposing it upon the chart. Because of the structure of
these representational media, a complex computation can be realized via a few simple alignment
procedures. But the fact that this simple superimposition of structure does get the right answer depends
critically upon the properties of the plotting tool and the chart itself, which are artifacts that have been
created by people who are not present at the occasion of their use.

Consider the relationship between the cartographer who created the chart and the navigator who uses
it as one kind of "collaborative manipulation." Every time someone plots a position on the chart, it is a
collaboration with the cartographer. Even though the full computation is distributed across space and
time and social organization, it is only accomplished by the cartographer and the navigator
collaboratively manipulating the computational artifacts of this world. The cartographer could not
anticipate where on the chart a ship might be, but had strong expectations about the nature of the
procedures that would be used to plot the position and constructed the chart in such a way that those
procedures would in fact work.

There is a more immediate sense of collaborative manipulation in the concurrent joint activities of
the members of the navigation team. While there is a nominal division of labor among the team
members, several of them are co-located in a shared space with shared access to several of the
representational technologies. In the process of computing the ship's location, they collaborate in the
manipulation of the representational artifacts. Two people may work together to align a plotting tool
for a line of position on the chart, or one person may anticipate the needs of another and manipulate a
medium to put it in a state from which the other can proceed more easily. Sometimes they achieve
coordination with each other by manipulating the structure of the representational artifacts in their
environment; sometimes they manipulate the structure of sound waves in the air in their environment;
sometimes they gesture and touch each other.

Here we have two instances of "collaborative manipulation" in a real-world task setting. How might
they be mapped into the design of a computer interface? Well, consider the situation of any of the
people in the navigation setting. The environment contains artifacts and other humans. This person
converses with the other people, and manipulates the objects in the environment. But the other people
are manipulating those objects as well, and sometimes the communication among the people is
conducted via the manipulation of those objects. This suggests a system that contains both a
model-world and an intelligent agent. The user should be able to have a conversation about the world
with the agent, and both the user and the agent should be able to manipulate the shared world. Figure 6
shows the relation of user to agent and world of action under the collaborative manipulation metaphor.


Command  Completion 

FIGURE 6. The Collaborative Manipulation Interface. This is a combination of the conversation and model-world interfaces.
Here the user may interact with an intermediary that can act upon the world of action, or the user may act upon that world
directly.

As a very simple example, consider command completion, a feature that has been around for a long
time in some systems. A command language interface is "conversational" in the sense that the user
provides descriptions of actions to be taken by an intermediary in some world. At the level of task
performance, therefore, the interface is not a model world. At the level of the specification of the
character strings that constitute the commands, however, it is usually experienced as a model world.
The user takes actions (presses keys) and sees the consequences immediately. Command completion
facilities are a way for the interface itself to anticipate, on the basis of partial input, what the user
intends, and to use that anticipation to collaboratively manipulate the world that the user is
manipulating. Typically, the user types a few characters of a command, then types <space> to signal the
collaborator that it should attempt to type the remaining characters.5 We can consider this activity from
the point of view of each of the metaphors. Seen via the conversational metaphor, the computer
completes one's utterance, just as a good conversational partner might do. From the model-world
perspective, the user and machine are engaged in a collaborative manipulation of the user's input. But
notice how this last point reflects back onto the human-human conversational setting. An important
aspect of conversation is that it is collaborative manipulation of the expressions in the speech channel.
When the type of "doing" we are concerned with is "saying," then "saying is doing." It sounds silly, but
it is simply another instance of the collapse of the reference gap. In the same way that a conversational
interface normally gives a direct manipulation interface to the task of producing character strings, so
speech gives us a direct manipulation interface to the production of phonetic sequences.
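
Command completion of the kind described above can be sketched as follows. The command set and the function names are hypothetical, and the rule used (extend the input to the longest prefix shared by every matching command) is only one plausible reading of "type the remaining characters." The input buffer is the shared world: the user adds characters to it one key at a time, and the collaborator adds characters to it when signaled with a space.

```python
import os

COMMANDS = ["compile", "copy", "delete", "describe", "edit"]  # hypothetical set


def complete(partial, commands=COMMANDS):
    """Extend partial to the longest prefix shared by all matching commands."""
    matches = [c for c in commands if c.startswith(partial)]
    if not matches:
        return partial  # nothing to add; leave the user's input as typed
    return os.path.commonprefix(matches)


def press(buffer, key):
    """Apply one key press to the shared input buffer."""
    if key == " ":
        return complete(buffer)  # the collaborator types what it can
    return buffer + key  # the user types one character


buffer = ""
for key in "ed ":
    buffer = press(buffer, key)
print(buffer)  # edit
```

When the partial input is ambiguous, as with "de" here, the collaborator adds nothing, and the user resumes typing in the same buffer.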

Two extensions to the Steamer system have been built on a collaborative manipulation metaphor.6
One is an intelligent display controller for process monitoring situations (McCandless, 1986), and the
other is an intelligent knowledge-based graphic designer's aid (Weitzman, 1986). I discuss these below.

Display  controller.  Displays  of  the  type  that  can  be  easily  created  in  Steamer  can  be  connected  to 
real-time  processes  as  well  as  to  simulation  models.  In  typical  applications  users  can  choose  which 
display  they  would  like  to  attend  to  at  any  point  in  time.  One  of  the  problems  in  real-time  process 
monitoring  is  that  the  operator  forms  hypotheses  about  the  state  of  die  process  and  may  subsequently 
search  for  information  that  confirms  the  hypothesis  while  disregarding  evidence  that  conflicts  with  the 
hypothesis.  One  way  to  solve  this  problem  is  to  have  another  mind  present  with  other  hypotheses. 
Such  a  "doubting  Thomas"  may  point  to  other  information.  Our  group  at  UCSD  has  implemented  an 
intelligent  display  controller  that  selects  displays  and  display  components  based  upon  the  "importance" 
of  the  process  variables  that  are  indicated  by  tire  display  components.  The  process  variables  themselves 
know  when  they  are  in  or  out  of  their  normal  operating  ranges,  for  example,  and  the  display  controller 
can  give  priority  to  display  components  that  report  the  values  of  variables  that  are  out  of  range.  In  fact, 
the  controller  is  implemented  as  a  parallel  distributed  processing  network  that  is  capable  of  learning 
trends  in  values  that  precede  "important"  events,  so  it  can  anticipate  states  of  the  process  and  can  give 
variables  that  are  moving  in  a  direction  that  is  ominous  in  the  current  context  display  priority  before 
their values actually become alarming. When the display controller presents the operator with a
display, the operator may reject display components, indicating that they are not relevant in the current
context.7 The display controller then learns about the operator’s preferences in the same way it learned
about the system's behavior by observation. In this system, the display is the shared world of action.
The contents of the display are collaboratively manipulated by the operator and the display controller.
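
The selection behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the UCSD implementation: the actual controller is a parallel distributed processing network that learns trends, whereas the sketch substitutes a fixed score with a simple penalty for rejected components. The names ProcessVariable, DisplayController, select, and reject, and all of the numeric weights, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessVariable:
    """A process variable that knows its own normal operating range."""
    name: str
    value: float
    low: float
    high: float

    def out_of_range(self) -> bool:
        return not (self.low <= self.value <= self.high)


@dataclass
class DisplayController:
    """Hypothetical stand-in for the controller: ranks variables for display."""
    # Per-variable relevance weight; lowered each time the operator rejects it.
    relevance: dict = field(default_factory=dict)
    penalty: float = 0.5  # assumed factor, not taken from the paper

    def importance(self, var: ProcessVariable) -> float:
        # Out-of-range variables get a high base score (assumed values).
        base = 1.0 if var.out_of_range() else 0.1
        return base * self.relevance.get(var.name, 1.0)

    def select(self, variables, slots: int):
        """Return the names of the variables that earn a display slot."""
        ranked = sorted(variables, key=self.importance, reverse=True)
        return [v.name for v in ranked[:slots]]

    def reject(self, name: str) -> None:
        """Operator indicates the component is not relevant in this context."""
        self.relevance[name] = self.relevance.get(name, 1.0) * self.penalty


variables = [
    ProcessVariable("boiler_pressure", 950.0, 0.0, 900.0),  # out of range
    ProcessVariable("feed_temperature", 120.0, 100.0, 200.0),  # normal
    ProcessVariable("drum_level", 5.0, 10.0, 90.0),  # out of range
]
controller = DisplayController()
shown = controller.select(variables, slots=2)
controller.reject("boiler_pressure")
shown_after_rejection = controller.select(variables, slots=1)
```

In the sketch, both agents act on the same object: the controller proposes the contents of the display, and the operator's rejection changes what the controller proposes next.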

Graphic  design  aid.  The  graphics  editor  that  was  developed  in  connection  with  the  Steamer  project 
permits  subject  matter  experts  with  no  computing  expertise  to  generate  diagrams  (which  are  actually 
complex Lisp programs) simply by assembling them in a model-world environment. These subject
matter experts are seldom expert graphic designers, so the diagrams they create, while capturing something
of the subject matter expert’s expertise, may be of poor graphic design quality and may not be
stylistically  similar  to  each  other.  Designer  is  an  expert  system  that  shares  the  diagram  with  the  user  as 
a model world for action. The user can have the designer system analyze the diagram. Designer will
find  violations  of  design  principles  and  notify  the  user.  Furthermore,  the  user  can  ask  the  system  to 
demonstrate ways to correct the violations. Demonstration is an important interface event because it
implies collaborative manipulation. The agent performing the demonstration must have direct access to
the world, and the actions performed in the demonstration are the content of the communication to the
other agent. The ability of the agent who receives the demonstration to act in that world is a presupposition
of the demonstration act.


3 Although it seems to share some features with command completion, the Do What I Mean (DWIM) facility (Teitelman, 1974) in
Interlisp does not belong here. DWIM frequently simply makes the most likely interpretation of the user's input and executes that
without notifying the user that it is doing so. DWIM is an intelligent agent, but the input expression itself is never an object of
discussion, so there is no shared world of action.

6 At the time these systems were designed, collaborative manipulation was not part of our vocabulary in the laboratory, but the
ideas that term refers to were clearly present.

7 This does admit the possibility of the operator perseverating on a faulty interpretation.


DISCUSSION 

Looking back across the several metaphors, we can see relationships between the nature of the technology
available and the metaphors for interaction. In the case of the teletype, we could see that technology
can suggest metaphors, or at least constrain the sorts of mode of interaction metaphors that are
supportable. Teletype technology supports a conversation, figuratively speaking, between user and
machine, while high-resolution bitmaps and pointing devices suggest model worlds. But the metaphor can
also  constrain  the  possibilities  we  see  in  the  technology.  The  conversation  metaphor,  in  its  narrow 
sense,  steers  us  away  from  the  declaration  metaphor  by  emphasizing  the  presence  of  an  intermediary. 
The declaration metaphor is an example of a change in the power of the interface that is brought about
not  by  a  change  in  technology,  but  by  a  change  in  interface  metaphor.  When  users  discover  the 
declaration  metaphor,  they  are  discovering  a  mode  of  interaction  that  is  possible  in  the  technology  of 
the  interface  but  which  is  not  seen  under  the  conversation  metaphor. 

The choice of a mode of interaction metaphor can make great differences in the power of an interface.
We are often not aware of having chosen a particular metaphor, and do not often consider the
options  available  and  their  computational  properties.  In  this  paper  I  have  argued  for  the  viability  of  two 
metaphors in addition to the conversational metaphor: the model world and a hybrid, collaborative
manipulation.  The  key  to  the  properties  of  the  interface  lies  in  the  reference  relations  between  the 
expressions in the interface language and the things to which they refer. There are advantages in the
abstractness and the ambiguity of symbolic descriptions. There are also gains to be had in taking
advantage  of  the  magical  character  of  the  worlds  that  exist  on  computers.  They  can  be  designed  in 
such  a  way  that  "saying  is  doing,"  and  this  can  be  exploited  to  give  the  user  great  ease  of  interaction. 
Supporting  that  ease  of  interaction,  however,  leads  to  limitations  on  the  language  that  may  prevent  it 
reaching  the  power  of  the  symbolic  description  mode  of  interaction. 

The issue is clearly not a question of which metaphor is the "best." I only hope we can recognize
that metaphors are present at all stages of interface design and use and that they have important consequences.
I also hope we can realize that we have, in some sense, been captured by one of several possible
metaphors. My reasons for hoping we can come to this vision are, in fact, my reservations about
the  assumptions  underlying  the  conversation  metaphor.  First,  taking  the  problem  of  human-computer 
interaction to be a communicational problem assumes that the computer will be another intelligent
agent rather than a tool or a structured medium that the user can manipulate. It may be that computers
will have an important role as agents, but it is certain that they will be a vital class of tool. Communication
should not be the only organizing metaphor for human-computer interaction. Second, assuming
that human-human communication is achieved primarily via conversation removed from the objects
referred  to  may  be  a  mistake.  In  face-to-face  conversation,  a  world  is  present  that  may  contain  objects 
or  events  to  which  the  conversation  refers.  This  makes  reference  different  in  that  one  can  refer  to  a  seen 
world, and it means that other modes of communication besides speech are available, e.g., demonstration.
Looking at the interactions of individuals in a highly evolved real-world task setting, we see
conversation,  but  we  also  see  the  collaborative  manipulation  of  representational  media.  Conversation  is 
good when the nature of the task needs to be negotiated or the division of labor is not specified, but
when the task is well understood, little conversation needs to take place. In highly evolved task settings,
a good deal of the expertise of the system as a whole is in the structure of the artifacts rather
than  in  the  people  themselves.  Third,  the  skills  that  people  have  dealing  with  each  other  are  adaptations 
to  the  limitations  of  people.  It  may  be  that  a  computer  could  be  even  easier  for  a  person  to  deal  with 
than  another  person  would  be.  Seeking  to  imitate  human  behavior  with  computers  that  are  to  have 
roles in task performances may be setting the wrong sort of standard of performance. This criticism
applies to all of the interface metaphors discussed in this paper, since all of them are based on mappings
from interaction with noncomputational systems. Because computers can manifest behaviors that
are not possible in any other medium, we should use our imaginations in the design process. Perhaps as
technology develops, we will be able to think of the human as the limited partner in the interaction and
design, not another human, but an environment that complements the abilities of human users.

I take these caveats as reminders that the space of interfaces is larger than we have assumed and that
it may be larger than we can presently imagine. Given the power of metaphors to change the
phenomenological feel of interfaces and the influence of mode of interaction metaphors on the direction
of development of technology, we, as designers, have a responsibility to give careful consideration
to the metaphors we use.